SENSEMBERT: context-enhanced sense embeddings for multilingual word sense disambiguation

Scarlini, Bianca; Pasini, Tommaso; Navigli, Roberto

doi:10.1609/aaai.v34i05.6402

Contextual representations of words derived by neural language models have proven to effectively encode the subtle distinctions that might occur between different meanings of the same word. However, these representations are not tied to a semantic network, hence they leave the word meanings implicit and thereby neglect the information that can be derived from the knowledge base itself. In this paper, we propose SENSEMBERT, a knowledge-based approach that brings together the expressive power of language modelling and the vast amount of knowledge contained in a semantic network to produce high-quality latent semantic representations of word meanings in multiple languages. Our vectors lie in a space comparable with that of contextualized word embeddings, thus allowing a word occurrence to be easily linked to its meaning by applying a simple nearest neighbour approach. We show that, whilst not relying on manual semantic annotations, SENSEMBERT is able to either achieve or surpass state-of-the-art results attained by most of the supervised neural approaches on the English Word Sense Disambiguation task. When scaling to other languages, our representations prove to be equally effective as their English counterpart and outperform the existing state of the art on all the Word Sense Disambiguation multilingual datasets. The embeddings are released in five different languages at http://sensembert.org
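The nearest neighbour linking mentioned in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure and uses made-up three-dimensional vectors. The sense labels, the `cosine` and `nearest_sense` helpers, and the toy values are hypothetical and are not taken from the paper or the released embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_sense(context_vec, sense_vecs):
    """Return the label of the candidate sense closest to the context vector.

    sense_vecs maps a sense label to its embedding; both the labels and the
    vectors here are illustrative placeholders (assumption).
    """
    return max(sense_vecs, key=lambda label: cosine(context_vec, sense_vecs[label]))


# Toy candidate senses for the word "bank" (hypothetical values).
toy_senses = {
    "bank_financial": [0.9, 0.1, 0.0],
    "bank_river": [0.1, 0.9, 0.2],
}

# Toy contextual vector for one occurrence of "bank" (hypothetical values).
occurrence = [0.2, 0.8, 0.1]

print(nearest_sense(occurrence, toy_senses))  # prints "bank_river"
```

In a real setting the context vector would come from a contextualized language model and the sense vectors from the released embeddings; the sketch only shows the lookup step.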

SENSEMBERT: context-enhanced sense embeddings for multilingual word sense disambiguation / Scarlini, Bianca; Pasini, Tommaso; Navigli, Roberto. - (2020), pp. 8758-8765. (Paper presented at the National Conference of the American Association for Artificial Intelligence, held in New York, USA) [10.1609/aaai.v34i05.6402].



Year of publication: 2020
Conference name: National Conference of the American Association for Artificial Intelligence
Keywords: natural language processing; word sense disambiguation; multilinguality
Type: 04 Publication in conference proceedings::04b Conference paper in volume

Files attached to this record

File	Size	Format
scarlini_sensembert_2020.pdf (open access; type: post-print, the version accepted for publication after peer review; license: all rights reserved)	558.56 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1349857
